The Sparks Foundation

By: Karan

Task-4 Global Terrorism

Problem statement

● Perform ‘Exploratory Data Analysis’ on dataset ‘Global Terrorism’

● As a security/defense analyst, try to find out the hot zone of terrorism.

● What security issues and insights can you derive from EDA?

● dataset: https://bit.ly/2TK5Xn5

Importing libraries

In [4]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import folium
import folium.plugins
import io

from matplotlib import animation,rc
import base64

Reading csv

In [5]:
# low_memory=False lets pandas infer each column's dtype from the whole file, avoiding the mixed-type DtypeWarning
data=pd.read_csv("globalterrorismdb.csv",encoding='ISO-8859-1',low_memory=False)
data
Out[5]:
eventid iyear imonth iday approxdate extended resolution country country_txt region ... addnotes scite1 scite2 scite3 dbsource INT_LOG INT_IDEO INT_MISC INT_ANY related
0 197000000001 1970 7 2 NaN 0 NaN 58 Dominican Republic 2 ... NaN NaN NaN NaN PGIS 0 0 0 0 NaN
1 197000000002 1970 0 0 NaN 0 NaN 130 Mexico 1 ... NaN NaN NaN NaN PGIS 0 1 1 1 NaN
2 197001000001 1970 1 0 NaN 0 NaN 160 Philippines 5 ... NaN NaN NaN NaN PGIS -9 -9 1 1 NaN
3 197001000002 1970 1 0 NaN 0 NaN 78 Greece 8 ... NaN NaN NaN NaN PGIS -9 -9 1 1 NaN
4 197001000003 1970 1 0 NaN 0 NaN 101 Japan 4 ... NaN NaN NaN NaN PGIS -9 -9 1 1 NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
181686 201712310022 2017 12 31 NaN 0 NaN 182 Somalia 11 ... NaN "Somalia: Al-Shabaab Militants Attack Army Che... "Highlights: Somalia Daily Media Highlights 2 ... "Highlights: Somalia Daily Media Highlights 1 ... START Primary Collection 0 0 0 0 NaN
181687 201712310029 2017 12 31 NaN 0 NaN 200 Syria 10 ... NaN "Putin's 'victory' in Syria has turned into a ... "Two Russian soldiers killed at Hmeymim base i... "Two Russian servicemen killed in Syria mortar... START Primary Collection -9 -9 1 1 NaN
181688 201712310030 2017 12 31 NaN 0 NaN 160 Philippines 5 ... NaN "Maguindanao clashes trap tribe members," Phil... NaN NaN START Primary Collection 0 0 0 0 NaN
181689 201712310031 2017 12 31 NaN 0 NaN 92 India 6 ... NaN "Trader escapes grenade attack in Imphal," Bus... NaN NaN START Primary Collection -9 -9 0 -9 NaN
181690 201712310032 2017 12 31 NaN 0 NaN 160 Philippines 5 ... NaN "Security tightened in Cotabato following IED ... "Security tightened in Cotabato City," Manila ... NaN START Primary Collection -9 -9 0 -9 NaN

181691 rows × 135 columns
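Loading a wide CSV like this one can raise a mixed-type `DtypeWarning`, because pandas by default infers column types chunk by chunk. The sketch below shows two common ways to handle it on a tiny in-memory sample. The sample text and the choice of `approxdate` as the column to pin are assumptions made for illustration, not values read from the real file.

```python
import io
import pandas as pd

# Hypothetical two-column sample standing in for the real CSV file.
csv_text = "eventid,approxdate\n197000000001,\n197000000002,January 1970\n"

# Option 1: let pandas look at the whole file before choosing dtypes.
sample = pd.read_csv(io.StringIO(csv_text), low_memory=False)

# Option 2: pin the dtype of a column known to mix numbers and text.
sample_typed = pd.read_csv(io.StringIO(csv_text), dtype={"approxdate": str})

print(sample.dtypes)
print(sample_typed.dtypes)
```

Pinning dtypes is more work for 135 columns, but it makes the load reproducible if the file changes.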

Dropping columns that have fewer than 160000 non-null values (that is, more than 21691 of the 181691 records are empty)

In [6]:
data_new = data.dropna(thresh=160000,axis=1)
In [7]:
data_new
Out[7]:
eventid iyear imonth iday extended country country_txt region region_txt provstate ... weapsubtype1_txt nkill nwound property ishostkid dbsource INT_LOG INT_IDEO INT_MISC INT_ANY
0 197000000001 1970 7 2 0 58 Dominican Republic 2 Central America & Caribbean NaN ... NaN 1.0 0.0 0 0.0 PGIS 0 0 0 0
1 197000000002 1970 0 0 0 130 Mexico 1 North America Federal ... NaN 0.0 0.0 0 1.0 PGIS 0 1 1 1
2 197001000001 1970 1 0 0 160 Philippines 5 Southeast Asia Tarlac ... NaN 1.0 0.0 0 0.0 PGIS -9 -9 1 1
3 197001000002 1970 1 0 0 78 Greece 8 Western Europe Attica ... Unknown Explosive Type NaN NaN 1 0.0 PGIS -9 -9 1 1
4 197001000003 1970 1 0 0 101 Japan 4 East Asia Fukouka ... NaN NaN NaN 1 0.0 PGIS -9 -9 1 1
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
181686 201712310022 2017 12 31 0 182 Somalia 11 Sub-Saharan Africa Middle Shebelle ... Unknown Gun Type 1.0 2.0 -9 0.0 START Primary Collection 0 0 0 0
181687 201712310029 2017 12 31 0 200 Syria 10 Middle East & North Africa Lattakia ... Projectile (rockets, mortars, RPGs, etc.) 2.0 7.0 1 0.0 START Primary Collection -9 -9 1 1
181688 201712310030 2017 12 31 0 160 Philippines 5 Southeast Asia Maguindanao ... Arson/Fire 0.0 0.0 1 0.0 START Primary Collection 0 0 0 0
181689 201712310031 2017 12 31 0 92 India 6 South Asia Manipur ... Grenade 0.0 0.0 -9 0.0 START Primary Collection -9 -9 0 -9
181690 201712310032 2017 12 31 0 160 Philippines 5 Southeast Asia Maguindanao ... Unknown Explosive Type 0.0 0.0 0 0.0 START Primary Collection -9 -9 0 -9

181691 rows × 47 columns
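The `thresh` argument of `dropna` counts non-null values, not missing ones: a column is kept only if it has at least `thresh` non-null entries. A minimal sketch on a made-up four-row frame (the column names are hypothetical):

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "mostly_full": [1, 2, 3, np.nan],             # 3 non-null values
    "mostly_empty": [np.nan, np.nan, np.nan, 4],  # 1 non-null value
    "full": [1, 2, 3, 4],                         # 4 non-null values
})

# Keep only the columns with at least 3 non-null values.
kept = toy.dropna(thresh=3, axis=1)
print(list(kept.columns))
```

Here `mostly_empty` is dropped because it has only one non-null value, while the other two columns survive.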

In [8]:
data_new.shape
Out[8]:
(181691, 47)
In [9]:
data_new.isnull().sum()
Out[9]:
eventid                 0
iyear                   0
imonth                  0
iday                    0
extended                0
country                 0
country_txt             0
region                  0
region_txt              0
provstate             421
city                  434
latitude             4556
longitude            4557
specificity             6
vicinity                0
crit1                   0
crit2                   0
crit3                   0
doubtterr               1
multiple                1
success                 0
suicide                 0
attacktype1             0
attacktype1_txt         0
targtype1               0
targtype1_txt           0
targsubtype1        10373
targsubtype1_txt    10373
target1               636
natlty1              1559
natlty1_txt          1559
gname                   0
guncertain1           380
individual              0
weaptype1               0
weaptype1_txt           0
weapsubtype1        20768
weapsubtype1_txt    20768
nkill               10313
nwound              16311
property                0
ishostkid             178
dbsource                0
INT_LOG                 0
INT_IDEO                0
INT_MISC                0
INT_ANY                 0
dtype: int64

EDA

In [10]:
# Work on an explicit copy so adding a column does not trigger SettingWithCopyWarning
data_new = data_new.copy()
# Note: the sum is NaN whenever nkill or nwound is missing for a record
data_new['casualities']=data_new['nkill']+data_new['nwound']

print('Country with Highest Terrorist Attacks:',data_new['country_txt'].value_counts().index[0])
print('Regions with Highest Terrorist Attacks:',data_new['region_txt'].value_counts().index[0])
print('Maximum people killed in an attack are:',data_new['nkill'].max(),'that took place in',data_new.loc[data_new['nkill'].idxmax()].country_txt)
print("Year with the most attacks:",data_new['iyear'].value_counts().idxmax())
print("Month with the most attacks:",data_new['imonth'].value_counts().idxmax())
print("Most Attack Types:",data_new['attacktype1_txt'].value_counts().idxmax())
Country with Highest Terrorist Attacks: Iraq
Regions with Highest Terrorist Attacks: Middle East & North Africa
Maximum people killed in an attack are: 1570.0 that took place in Iraq
Year with the most attacks: 2014
Month with the most attacks: 5
Most Attack Types: Bombing/Explosion
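Because `casualities` is computed as `nkill + nwound`, any record where either count is missing ends up with a missing total. The sketch below contrasts that behavior with `Series.add(..., fill_value=0)`, which treats a single missing side as zero. The three rows are invented for illustration, and whether missing counts should be treated as zero is an analytical assumption, not something the dataset settles.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "nkill": [1.0, np.nan, 2.0],
    "nwound": [0.0, 3.0, np.nan],
})

# Plain addition: NaN on either side gives NaN.
strict = toy["nkill"] + toy["nwound"]

# add() with fill_value: a missing side counts as 0 (still NaN if both are missing).
lenient = toy["nkill"].add(toy["nwound"], fill_value=0)

print(strict.tolist())
print(lenient.tolist())
```

The choice matters for yearly totals, since `nkill` and `nwound` are missing for thousands of records.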

Terrorist Groups with most attacks

In [11]:
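# Skip the first entry (expected to be the 'Unknown' group in this dataset) and plot the next nine groups
top_groups = data_new['gname'].value_counts()[1:10]
# Recent seaborn versions require x and y as keyword arguments
sns.barplot(x=top_groups.values,y=top_groups.index,palette='Set1')
plt.xticks(rotation=90)
fig=plt.gcf()
fig.set_size_inches(10,8)
plt.title('Terrorist Groups with Highest Terror Attacks')
plt.show()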

Countries with highest terrorist attacks

In [12]:
# Count attacks per country once and reuse the result
country_counts = data_new['country_txt'].value_counts()
print(f"The highest number of terrorist attacks was committed in {country_counts.index[0]} with {country_counts.iloc[0]} attacks")

print('\nThe other 9 countries with highest terrorist attacks are:')
for i in range(1,10):
    # .iloc makes the positional lookup explicit on a label-indexed Series
    print(f"{i+1}. {country_counts.index[i]} with {country_counts.iloc[i]} attacks")

#Visualization
plt.subplots(figsize=(15,6))
sns.barplot(x=country_counts[:10].index,y=country_counts[:10].values,palette='Set1')
plt.title('Top Countries Affected')
plt.xlabel('Countries')
plt.ylabel('Count')
plt.xticks(rotation= 90)
plt.show()
The highest number of terrorist attacks was committed in Iraq with 24636 attacks

The other 9 countries with highest terrorist attacks are:
2. Pakistan with 14368 attacks
3. Afghanistan with 12731 attacks
4. India with 11960 attacks
5. Colombia with 8306 attacks
6. Philippines with 6908 attacks
7. Peru with 6096 attacks
8. El Salvador with 5320 attacks
9. United Kingdom with 5235 attacks
10. Turkey with 4292 attacks

Number Of Terrorist Activities per Year

In [13]:
f, ax = plt.subplots(figsize=(10, 7))
plt.title('Number Of Terrorist Activities per Year')
sns.despine(f)
# distplot is deprecated in recent seaborn; histplot with a KDE overlay is the closest replacement
sns.histplot(data_new['iyear'], bins=20, color="g", kde=True, stat="density", ax=ax)
plt.show()

The plot shows that the number of terrorist activities rose sharply after 2010.
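A histogram with 20 bins groups several years together, so exact per-year counts are a useful cross-check on a trend like this. A minimal sketch, assuming a small made-up `iyear` series rather than the real column:

```python
import pandas as pd

# Hypothetical years, one entry per attack; not taken from the dataset.
years = pd.Series([2009, 2010, 2011, 2011, 2012, 2012, 2012], name="iyear")

# Attacks per year, ordered chronologically instead of by frequency.
per_year = years.value_counts().sort_index()

# Year-over-year relative change (NaN for the first year).
growth = per_year.pct_change()

print(per_year)
print(growth)
```

Applied to `data_new['iyear']`, the same two calls would give the exact yearly counts behind the plot.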

Terrorist Activities by Region each Year

In [14]:
pd.crosstab(data_new.iyear, data_new.region_txt).plot(kind='area',figsize=(15,6))
plt.title('Terrorist Activities by Region each Year')
plt.ylabel('Number of Attacks')
plt.show()
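The area plot is built on `pd.crosstab`, which counts how often each (year, region) pair occurs and returns one row per year and one column per region. A small sketch with invented rows shows the shape of that intermediate table:

```python
import pandas as pd

# Hypothetical records; only the two columns used by the crosstab.
toy = pd.DataFrame({
    "iyear": [1970, 1970, 1971, 1971, 1971],
    "region_txt": ["South Asia", "Western Europe",
                   "South Asia", "South Asia", "Western Europe"],
})

# Rows are years, columns are regions, cells are attack counts.
table = pd.crosstab(toy.iyear, toy.region_txt)
print(table)
```

Inspecting this table directly is often easier than reading values off a stacked area chart.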

Number Of Casualties Each Year

In [15]:
plt.subplots(figsize=(10,7))
# Sum killed plus wounded per year (the column is named 'casualities' in data_new)
year_casual = data_new.groupby('iyear').casualities.sum().to_frame().reset_index()
year_casual.columns = ['Year','Casualties']
plt.title('Number Of Casualties Each Year')
# palette has no effect without a hue variable, so only color is set
sns.lineplot(x='Year', y='Casualties', data=year_casual,color="g")
plt.show()

Most common target

In [16]:
plt.subplots(figsize=(15,6))
# Recent seaborn versions require the variable as a keyword argument
sns.countplot(x=data_new['targtype1_txt'],palette='Set1',order=data_new['targtype1_txt'].value_counts().index)
plt.xticks(rotation=90)
plt.title('Most common target')
plt.show()

World-wide map of Terrorism

Attacks are drawn as circle markers in three colors and four sizes, both determined by the number of people killed in each attack. Only the first 8000 records with known coordinates are plotted. The logic is in the code below:

In [17]:
terror_fol=data_new.copy()
terror_fol.dropna(subset=['latitude','longitude'],inplace=True)
location_fol=terror_fol[['latitude','longitude']][:8000]
country_fol=terror_fol['country_txt'][:8000]
city_fol=terror_fol['city'][:8000]
killed_fol=terror_fol['nkill'][:8000]
wound_fol=terror_fol['nwound'][:8000]
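def color_point(x):
    # Color by number killed: red for 30 or more, blue for 1 to 29, orange for 0 or missing (NaN)
    if x>=30:
        color='red'
    elif x>0:
        color='blue'
    else:
        color='orange'
    return color
def point_size(x):
    # Radius by number killed; >=30 (not >30) so exactly 30 deaths falls in the same band as the red color
    if (x>=30 and x<100):
        size=2
    elif (x>=100 and x<500):
        size=8
    elif x>=500:
        size=16
    else:
        size=0.5
    return size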
map2 = folium.Map(location=[30,0],tiles='cartodbpositron',zoom_start=2)
for point in location_fol.index:
    info='<b>Country: </b>'+str(country_fol[point])+'<br><b>City: </b>'+str(city_fol[point])+'<br><b>Killed: </b>'+str(killed_fol[point])+'<br><b>Wounded: </b>'+str(wound_fol[point])
    iframe = folium.IFrame(html=info, width=200, height=200)
    folium.CircleMarker(list(location_fol.loc[point].values),popup=folium.Popup(iframe),radius=point_size(killed_fol[point]),color=color_point(killed_fol[point])).add_to(map2)
map2

Click on markers for more information.
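The color and size rules can also be applied to a whole column at once, which makes the thresholds easy to check before drawing the map. The sketch below uses `np.select` on invented kill counts. It assumes that an attack with exactly 30 deaths belongs in the radius-2 bucket (matching the red color threshold), and the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical kill counts, including a missing value.
kills = pd.Series([0, 5, 30, 150, 600, np.nan])

# Conditions are checked in order; the first match wins.
# Comparisons with NaN are False, so missing values fall to the default.
colors = np.select([kills >= 30, kills > 0], ["red", "blue"], default="orange")
sizes = np.select([kills >= 500, kills >= 100, kills >= 30],
                  [16.0, 8.0, 2.0], default=0.5)

print(list(colors))
print(list(sizes))
```

Records with zero or unknown deaths get the smallest orange marker, so they remain visible on the map without dominating it.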

Focussing on India

From this point on, we repeat the same analysis and plots for India.

Terror Activities in India

In [18]:
terror_india=data_new[data_new['country_txt']=='India']
terror_india_fol=terror_india.copy()
terror_india_fol.dropna(subset=['latitude','longitude'],inplace=True)
location_ind=terror_india_fol[['latitude','longitude']][:5000]
city_ind=terror_india_fol['city'][:5000]
killed_ind=terror_india_fol['nkill'][:5000]
wound_ind=terror_india_fol['nwound'][:5000]
target_ind=terror_india_fol['targtype1_txt'][:5000]

map4 = folium.Map(location=[20.59, 78.96],tiles='cartodbpositron',zoom_start=4.5)
for point in location_ind.index:
    folium.CircleMarker(list(location_ind.loc[point].values),popup='<b>City: </b>'+str(city_ind[point])+'<br><b>Killed: </b>'+str(killed_ind[point])+\
                        '<br><b>Injured: </b>'+str(wound_ind[point])+'<br><b>Target: </b>'+str(target_ind[point]),radius=point_size(killed_ind[point]),color=color_point(killed_ind[point]),fill_color=color_point(killed_ind[point])).add_to(map4)
map4